
    An Analytical Review of Audiovisual Systems for Detecting Personal Protective Equipment on the Human Face

    Since 2019, countries around the world have faced the rapid spread of the pandemic caused by the COVID-19 coronavirus infection, which the global community continues to fight to this day. Despite the evident effectiveness of personal respiratory protective equipment against coronavirus infection, many people neglect to wear protective face masks in public places. Therefore, to monitor compliance and promptly identify violators of public health rules, modern information technologies are needed that detect protective masks on people's faces from video and audio information. The article provides an analytical review of existing and emerging intelligent information technologies for bimodal analysis of the voice and facial characteristics of a person wearing a mask. There is a considerable body of research on mask detection from video images, and a significant number of publicly available corpora contain images of faces both with and without masks, collected in various ways. Research and development aimed at detecting personal respiratory protective equipment from the acoustic characteristics of human speech is still scarce, since this direction only began to develop during the pandemic caused by the COVID-19 coronavirus infection. Existing systems help prevent the spread of coronavirus infection by recognizing the presence or absence of face masks, and such systems also assist in the remote diagnosis of COVID-19 by detecting the first symptoms of the viral infection from acoustic characteristics. However, a number of problems in the automatic diagnosis of COVID-19 symptoms and the detection of the presence/absence of face masks remain unsolved. First of all, the accuracy of mask and coronavirus detection is low, which rules out automatic diagnosis without the involvement of experts (medical personnel). Many systems cannot operate in real time, which makes it impossible to control and monitor the wearing of protective masks in public places. Most existing systems also cannot be embedded into a smartphone so that users could test for the presence of coronavirus infection anywhere. Another major problem is collecting data from patients infected with COVID-19, since many people are unwilling to share confidential information.
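
    As an illustration of the acoustic side of such bimodal systems, below is a minimal sketch of a binary mask/no-mask speech classifier in PyTorch, with log-Mel spectrograms feeding a small 2D CNN. It is not any particular system from the review; the sample rate, feature settings, and architecture are illustrative assumptions.

```python
# Minimal sketch of an acoustic "mask / no-mask" classifier (illustrative only;
# not a specific system from the review). Assumes 16 kHz mono speech clips.
import torch
import torch.nn as nn
import torchaudio


class MaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=16000, n_fft=400, hop_length=160, n_mels=64)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),             # global pooling over time/frequency
        )
        self.head = nn.Linear(32, 2)             # logits: [no mask, mask]

    def forward(self, wav):                      # wav: (batch, samples)
        spec = self.mel(wav).unsqueeze(1).clamp(min=1e-10).log()  # (batch, 1, mels, frames)
        return self.head(self.conv(spec).flatten(1))


model = MaskNet()
logits = model(torch.randn(4, 16000))            # four 1-second dummy clips
print(logits.shape)                              # torch.Size([4, 2])
```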

    An Analysis of the Information and Mathematical Support for the Recognition of Human Affective States

    Π’ ΡΡ‚Π°Ρ‚ΡŒΠ΅ прСдставлСн аналитичСский ΠΎΠ±Π·ΠΎΡ€ исслСдований Π² области Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… вычислСний. Π­Ρ‚ΠΎ Π½Π°ΠΏΡ€Π°Π²Π»Π΅Π½ΠΈΠ΅ являСтся ΡΠΎΡΡ‚Π°Π²Π»ΡΡŽΡ‰Π΅ΠΉ искусствСнного ΠΈΠ½Ρ‚Π΅Π»Π»Π΅ΠΊΡ‚Π°, ΠΈ ΠΈΠ·ΡƒΡ‡Π°Π΅Ρ‚ ΠΌΠ΅Ρ‚ΠΎΠ΄Ρ‹, Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΠΌΡ‹ ΠΈ систСмы для Π°Π½Π°Π»ΠΈΠ·Π° Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний Ρ‡Π΅Π»ΠΎΠ²Π΅ΠΊΠ° ΠΏΡ€ΠΈ Π΅Π³ΠΎ взаимодСйствии с Π΄Ρ€ΡƒΠ³ΠΈΠΌΠΈ людьми, ΠΊΠΎΠΌΠΏΡŒΡŽΡ‚Π΅Ρ€Π½Ρ‹ΠΌΠΈ систСмами ΠΈΠ»ΠΈ Ρ€ΠΎΠ±ΠΎΡ‚Π°ΠΌΠΈ. Π’ области ΠΈΠ½Ρ‚Π΅Π»Π»Π΅ΠΊΡ‚ΡƒΠ°Π»ΡŒΠ½ΠΎΠ³ΠΎ Π°Π½Π°Π»ΠΈΠ·Π° Π΄Π°Π½Π½Ρ‹Ρ… ΠΏΠΎΠ΄ Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΎΠΌ подразумСваСтся проявлСниС психологичСских Ρ€Π΅Π°ΠΊΡ†ΠΈΠΉ Π½Π° Π²ΠΎΠ·Π±ΡƒΠΆΠ΄Π°Π΅ΠΌΠΎΠ΅ событиС, ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠ΅ ΠΌΠΎΠΆΠ΅Ρ‚ ΠΏΡ€ΠΎΡ‚Π΅ΠΊΠ°Ρ‚ΡŒ ΠΊΠ°ΠΊ Π² краткосрочном, Ρ‚Π°ΠΊ ΠΈ Π² долгосрочном ΠΏΠ΅Ρ€ΠΈΠΎΠ΄Π΅, Π° Ρ‚Π°ΠΊΠΆΠ΅ ΠΈΠΌΠ΅Ρ‚ΡŒ Ρ€Π°Π·Π»ΠΈΡ‡Π½ΡƒΡŽ ΠΈΠ½Ρ‚Π΅Π½ΡΠΈΠ²Π½ΠΎΡΡ‚ΡŒ ΠΏΠ΅Ρ€Π΅ΠΆΠΈΠ²Π°Π½ΠΈΠΉ. АффСкты Π² рассматриваСмой области Ρ€Π°Π·Π΄Π΅Π»Π΅Π½Ρ‹ Π½Π° 4 Π²ΠΈΠ΄Π°: Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Π΅ эмоции, Π±Π°Π·ΠΎΠ²Ρ‹Π΅ эмоции, настроСниС ΠΈ Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Π΅ расстройства. ΠŸΡ€ΠΎΡΠ²Π»Π΅Π½ΠΈΠ΅ Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний отраТаСтся Π² Π²Π΅Ρ€Π±Π°Π»ΡŒΠ½Ρ‹Ρ… Π΄Π°Π½Π½Ρ‹Ρ… ΠΈ Π½Π΅Π²Π΅Ρ€Π±Π°Π»ΡŒΠ½Ρ‹Ρ… характСристиках повСдСния: акустичСских ΠΈ лингвистичСских характСристиках Ρ€Π΅Ρ‡ΠΈ, ΠΌΠΈΠΌΠΈΠΊΠ΅, ТСстах ΠΈ ΠΏΠΎΠ·Π°Ρ… Ρ‡Π΅Π»ΠΎΠ²Π΅ΠΊΠ°. Π’ ΠΎΠ±Π·ΠΎΡ€Π΅ приводится ΡΡ€Π°Π²Π½ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹ΠΉ Π°Π½Π°Π»ΠΈΠ· ΡΡƒΡ‰Π΅ΡΡ‚Π²ΡƒΡŽΡ‰Π΅Π³ΠΎ ΠΈΠ½Ρ„ΠΎΡ€ΠΌΠ°Ρ†ΠΈΠΎΠ½Π½ΠΎΠ³ΠΎ обСспСчСния для автоматичСского распознавания Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний Ρ‡Π΅Π»ΠΎΠ²Π΅ΠΊΠ° Π½Π° ΠΏΡ€ΠΈΠΌΠ΅Ρ€Π΅ эмоций, сСнтимСнта, агрСссии ΠΈ дСпрСссии. НСмногочислСнныС русскоязычныС Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Π΅ Π±Π°Π·Ρ‹ Π΄Π°Π½Π½Ρ‹Ρ… ΠΏΠΎΠΊΠ° сущСствСнно ΡƒΡΡ‚ΡƒΠΏΠ°ΡŽΡ‚ ΠΏΠΎ ΠΎΠ±ΡŠΠ΅ΠΌΡƒ ΠΈ качСству элСктронным рСсурсам Π½Π° Π΄Ρ€ΡƒΠ³ΠΈΡ… ΠΌΠΈΡ€ΠΎΠ²Ρ‹Ρ… языках, Ρ‡Ρ‚ΠΎ обуславливаСт Π½Π΅ΠΎΠ±Ρ…ΠΎΠ΄ΠΈΠΌΠΎΡΡ‚ΡŒ рассмотрСния ΡˆΠΈΡ€ΠΎΠΊΠΎΠ³ΠΎ спСктра Π΄ΠΎΠΏΠΎΠ»Π½ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹Ρ… ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ΠΎΠ², ΠΌΠ΅Ρ‚ΠΎΠ΄ΠΎΠ² ΠΈ Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΠΌΠΎΠ², примСняСмых Π² условиях ΠΎΠ³Ρ€Π°Π½ΠΈΡ‡Π΅Π½Π½ΠΎΠ³ΠΎ объСма ΠΎΠ±ΡƒΡ‡Π°ΡŽΡ‰ΠΈΡ… ΠΈ тСстовых Π΄Π°Π½Π½Ρ‹Ρ…, ΠΈ ставит Π·Π°Π΄Π°Ρ‡Ρƒ Ρ€Π°Π·Ρ€Π°Π±ΠΎΡ‚ΠΊΠΈ Π½ΠΎΠ²Ρ‹Ρ… ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ΠΎΠ² ΠΊ Π°ΡƒΠ³ΠΌΠ΅Π½Ρ‚Π°Ρ†ΠΈΠΈ Π΄Π°Π½Π½Ρ‹Ρ…, пСрСносу обучСния ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ ΠΈ Π°Π΄Π°ΠΏΡ‚Π°Ρ†ΠΈΠΈ иноязычных рСсурсов. Π’ ΡΡ‚Π°Ρ‚ΡŒΠ΅ приводится описаниС ΠΌΠ΅Ρ‚ΠΎΠ΄ΠΎΠ² Π°Π½Π°Π»ΠΈΠ·Π° одномодальной Π²ΠΈΠ·ΡƒΠ°Π»ΡŒΠ½ΠΎΠΉ, акустичСской ΠΈ лингвистичСской ΠΈΠ½Ρ„ΠΎΡ€ΠΌΠ°Ρ†ΠΈΠΈ, Π° Ρ‚Π°ΠΊΠΆΠ΅ ΠΌΠ½ΠΎΠ³ΠΎΠΌΠΎΠ΄Π°Π»ΡŒΠ½Ρ‹Ρ… ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ΠΎΠ² ΠΊ Ρ€Π°ΡΠΏΠΎΠ·Π½Π°Π²Π°Π½ΠΈΡŽ Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний. ΠœΠ½ΠΎΠ³ΠΎΠΌΠΎΠ΄Π°Π»ΡŒΠ½Ρ‹ΠΉ ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ ΠΊ автоматичСскому Π°Π½Π°Π»ΠΈΠ·Ρƒ Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний позволяСт ΠΏΠΎΠ²Ρ‹ΡΠΈΡ‚ΡŒ Ρ‚ΠΎΡ‡Π½ΠΎΡΡ‚ΡŒ распознавания рассматриваСмых явлСний ΠΎΡ‚Π½ΠΎΡΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎ ΠΎΠ΄Π½ΠΎΠΌΠΎΠ΄Π°Π»ΡŒΠ½Ρ‹Ρ… Ρ€Π΅ΡˆΠ΅Π½ΠΈΠΉ. Π’ ΠΎΠ±Π·ΠΎΡ€Π΅ ΠΎΡ‚ΠΌΠ΅Ρ‡Π΅Π½Π° тСндСнция соврСмСнных исслСдований, Π·Π°ΠΊΠ»ΡŽΡ‡Π°ΡŽΡ‰Π°ΡΡΡ Π² Ρ‚ΠΎΠΌ, Ρ‡Ρ‚ΠΎ нСйросСтСвыС ΠΌΠ΅Ρ‚ΠΎΠ΄Ρ‹ постСпСнно Π²Ρ‹Ρ‚Π΅ΡΠ½ΡΡŽΡ‚ классичСскиС Π΄Π΅Ρ‚Π΅Ρ€ΠΌΠΈΠ½ΠΈΡ€ΠΎΠ²Π°Π½Π½Ρ‹Π΅ ΠΌΠ΅Ρ‚ΠΎΠ΄Ρ‹ благодаря Π»ΡƒΡ‡ΡˆΠ΅ΠΌΡƒ качСству распознавания состояний ΠΈ ΠΎΠΏΠ΅Ρ€Π°Ρ‚ΠΈΠ²Π½ΠΎΠΉ ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚ΠΊΠ΅ большого объСма Π΄Π°Π½Π½Ρ‹Ρ…. Π’ ΡΡ‚Π°Ρ‚ΡŒΠ΅ Ρ€Π°ΡΡΠΌΠ°Ρ‚Ρ€ΠΈΠ²Π°ΡŽΡ‚ΡΡ ΠΌΠ΅Ρ‚ΠΎΠ΄Ρ‹ Π°Π½Π°Π»ΠΈΠ·Π° Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний. 
ΠŸΡ€Π΅ΠΈΠΌΡƒΡ‰Π΅ΡΡ‚Π²ΠΎΠΌ использования ΠΌΠ½ΠΎΠ³ΠΎΠ·Π°Π΄Π°Ρ‡Π½Ρ‹Ρ… иСрархичСских ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ΠΎΠ² являСтся Π²ΠΎΠ·ΠΌΠΎΠΆΠ½ΠΎΡΡ‚ΡŒ ΠΈΠ·Π²Π»Π΅ΠΊΠ°Ρ‚ΡŒ Π½ΠΎΠ²Ρ‹Π΅ Ρ‚ΠΈΠΏΡ‹ Π·Π½Π°Π½ΠΈΠΉ, Π² Ρ‚ΠΎΠΌ числС ΠΎ влиянии, коррСляции ΠΈ взаимодСйствии Π½Π΅ΡΠΊΠΎΠ»ΡŒΠΊΠΈΡ… Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний Π΄Ρ€ΡƒΠ³ Π½Π° Π΄Ρ€ΡƒΠ³Π°, Ρ‡Ρ‚ΠΎ ΠΏΠΎΡ‚Π΅Π½Ρ†ΠΈΠ°Π»ΡŒΠ½ΠΎ Π²Π»Π΅Ρ‡Π΅Ρ‚ ΠΊ ΡƒΠ»ΡƒΡ‡ΡˆΠ΅Π½ΠΈΡŽ качСства распознавания. ΠŸΡ€ΠΈΠ²ΠΎΠ΄ΡΡ‚ΡΡ ΠΏΠΎΡ‚Π΅Π½Ρ†ΠΈΠ°Π»ΡŒΠ½Ρ‹Π΅ трСбования ΠΊ Ρ€Π°Π·Ρ€Π°Π±Π°Ρ‚Ρ‹Π²Π°Π΅ΠΌΡ‹ΠΌ систСмам Π°Π½Π°Π»ΠΈΠ·Π° Π°Ρ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½Ρ‹Ρ… состояний ΠΈ основныС направлСния Π΄Π°Π»ΡŒΠ½Π΅ΠΉΡˆΠΈΡ… исслСдований

    Multi-Corpus Learning for Audio–Visual Emotions and Sentiment Recognition

    Recognition of emotions and sentiment (affective states) from human audio–visual information is widely used in healthcare, education, entertainment, and other fields; it has therefore become a highly active research area. The large variety of corpora with heterogeneous data available for developing single-corpus approaches to affective state recognition means that an approach trained on one corpus may be less effective on another. In this article, we propose a multi-corpus learned audio–visual approach for emotion and sentiment recognition. It is based on extracting mid-level features at the segment level using two multi-corpus temporal models (a pretrained transformer with GRU layers for the audio modality and a pretrained 3D CNN with BiLSTM-Former for the video modality) and on predicting affective states using two single-corpus cross-modal gated self-attention fusion (CMGSAF) models. The proposed approach was tested on the RAMAS and CMU-MOSEI corpora. To date, our approach has outperformed state-of-the-art audio–visual approaches for emotion recognition by 18.2% (78.1% vs. 59.9%) on the CMU-MOSEI corpus in terms of Weighted Accuracy and by 0.7% (82.8% vs. 82.1%) on the RAMAS corpus in terms of Unweighted Average Recall.
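
    The gated cross-modal attention idea at the heart of the fusion stage can be sketched as follows; this is a rough PyTorch illustration, and the actual CMGSAF models may differ in structure and detail. All dimensions and the three-class sentiment head are assumptions.

```python
# Rough sketch of gated cross-modal attention fusion: video segment features
# attend to audio segment features, and a learned sigmoid gate decides how much
# of the attended audio to mix in. Dimensions and structure are illustrative.
import torch
import torch.nn as nn


class GatedCrossModalFusion(nn.Module):
    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)
        self.classifier = nn.Linear(dim, 3)        # e.g. negative/neutral/positive

    def forward(self, video, audio):               # (batch, segments, dim) each
        attended, _ = self.attn(query=video, key=audio, value=audio)
        g = torch.sigmoid(self.gate(torch.cat([video, attended], dim=-1)))
        fused = g * attended + (1 - g) * video     # gated mix of the two modalities
        return self.classifier(fused.mean(dim=1))  # pool over segments


out = GatedCrossModalFusion()(torch.randn(2, 10, 256), torch.randn(2, 10, 256))
print(out.shape)  # torch.Size([2, 3])
```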

    Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds

    This paper investigates different fusion strategies and provides insights into their effectiveness alongside standalone classifiers in the framework of paralinguistic analysis of infant vocalizations. We explore combinations of classifiers based on Support Vector Machines (SVM) and Extreme Learning Machines (ELM), as well as a weighted-kernel version of the latter, training the systems on different acoustic feature representations and applying weighted score-level fusion to their predictions. The proposed framework is tested on the INTERSPEECH ComParE-2019 Baby Sounds corpus, a collection of HomeBank infant vocalization corpora annotated with five classes. Adhering to the challenge protocol, with a single test set submission we outperform the challenge baseline Unweighted Average Recall (UAR) score and achieve a result comparable to the state of the art.
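
    Weighted score-level fusion of classifiers trained on different feature representations can be sketched with scikit-learn as below. The synthetic data, SVM settings, and fusion weight are placeholders; only the fusion scheme itself follows the paper's description.

```python
# Sketch of weighted score-level fusion: two classifiers trained on different
# acoustic feature representations, fused by a weighted sum of their class
# posteriors. Features, weights, and classifier settings are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
y_train, y_dev = rng.integers(0, 5, 200), rng.integers(0, 5, 50)  # 5 classes
feats_a_train, feats_a_dev = rng.normal(size=(200, 40)), rng.normal(size=(50, 40))
feats_b_train, feats_b_dev = rng.normal(size=(200, 88)), rng.normal(size=(50, 88))

clf_a = SVC(probability=True).fit(feats_a_train, y_train)
clf_b = SVC(probability=True).fit(feats_b_train, y_train)

w = 0.6  # fusion weight, normally tuned on the development set
scores = w * clf_a.predict_proba(feats_a_dev) + (1 - w) * clf_b.predict_proba(feats_b_dev)
pred = scores.argmax(axis=1)
print("UAR:", recall_score(y_dev, pred, average="macro"))  # Unweighted Average Recall
```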

    Complex Paralinguistic Analysis of Speech: Predicting Gender, Emotions and Deception in a Hierarchical Framework

    In this paper, we present a hierarchical framework for complex paralinguistic analysis of speech, including gender, emotion, and deception recognition. The main idea of the framework builds on research into the interrelation between various paralinguistic phenomena: it uses gender information to predict emotional states, and the outcome of emotion recognition to predict the truthfulness of the speech. We use multiple datasets (aGender, Ruslana, EmoDB, and DSD) to perform within-corpus and cross-corpus experiments with various performance measures. The experimental results reveal that gender-specific models improve the effectiveness of automatic speech emotion recognition in terms of Unweighted Average Recall by up to an absolute 5.7%, and that integrating emotion predictions improves the F-score of automatic deception detection over our baseline by an absolute 4.7%. The obtained cross-validation result of 88.4 Β± 1.5% for deception detection beats the existing state of the art by an absolute 2.8%.
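
    The cascade can be sketched as follows: a gender classifier routes samples to gender-specific emotion models, and the emotion posterior augments the features of the deception classifier. Data, classifiers, and dimensions are placeholders; only the hierarchy mirrors the paper's idea.

```python
# Sketch of the hierarchical idea: a gender classifier routes samples to
# gender-specific emotion models, and the emotion posterior is appended to the
# features of the deception classifier. Data and models are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                   # acoustic features (placeholder)
gender = rng.integers(0, 2, 300)                 # 0 = female, 1 = male
emotion = rng.integers(0, 4, 300)                # 4 emotion classes
deceptive = rng.integers(0, 2, 300)

gender_clf = LogisticRegression(max_iter=1000).fit(X, gender)
emo_clf = {g: LogisticRegression(max_iter=1000).fit(X[gender == g], emotion[gender == g])
           for g in (0, 1)}

# Route each sample through the emotion model of its predicted gender.
g_pred = gender_clf.predict(X)
emo_post = np.vstack([emo_clf[g].predict_proba(x[None])[0] for g, x in zip(g_pred, X)])

# The deception model sees the original features plus the emotion posterior.
deception_clf = LogisticRegression(max_iter=1000).fit(np.hstack([X, emo_post]), deceptive)
print(deception_clf.predict(np.hstack([X, emo_post]))[:10])
```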

    Multimodal Personality Traits Assessment (MuPTA) Corpus: The Impact of Spontaneous and Read Speech

    Automatic personality traits assessment (PTA) provides high-level, intelligible predictive inputs for subsequent critical downstream tasks, such as job interview recommendations and mental healthcare monitoring. In this work, we introduce a novel Multimodal Personality Traits Assessment (MuPTA) corpus. Our MuPTA corpus is unique in that it contains both spontaneous and read speech collected in the medium-resourced Russian language. We present a novel audio-visual approach for PTA that is used to set up baseline results on this corpus. We further analyze the impact of the spontaneous and read speech types on PTA predictive performance. We find that for the audio modality, PTA predictive performance on short signals is almost equal regardless of the speech type, while PTA using the video modality is more accurate on spontaneous speech than on read speech regardless of the signal length.
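
    A hypothetical analysis in the spirit of the reported finding, comparing prediction error by speech type and signal length, is sketched below; the column names, bucket boundaries, and numbers are invented for illustration.

```python
# Hypothetical analysis sketch: comparing PTA error by speech type and signal
# length, in the spirit of the corpus study. Column names and data are assumed.
import pandas as pd

df = pd.DataFrame({
    "speech_type": ["spontaneous", "read", "spontaneous", "read"],
    "duration_s":  [3.0, 2.5, 14.0, 15.0],
    "abs_error":   [0.11, 0.12, 0.08, 0.13],   # |predicted - annotated| trait score
})
df["length_bucket"] = pd.cut(df["duration_s"], bins=[0, 5, 30], labels=["short", "long"])
print(df.groupby(["speech_type", "length_bucket"], observed=True)["abs_error"].mean())
```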

    End-to-End Modeling and Transfer Learning for Audiovisual Emotion Recognition in-the-Wild

    As emotions play a central role in human communication, automatic emotion recognition has attracted increasing attention in the last two decades. While multimodal systems achieve high performance on lab-controlled data, they are still far from providing ecological validity on non-lab-controlled, β€œin-the-wild” data. This work investigates audiovisual deep learning approaches to the in-the-wild emotion recognition problem. Inspired by the outstanding performance of end-to-end and transfer learning techniques, we explored the effectiveness of architectures in which a modality-specific Convolutional Neural Network (CNN) is followed by a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN), using the AffWild2 dataset under the Affective Behavior Analysis in-the-Wild (ABAW) challenge protocol. We deployed unimodal end-to-end and transfer learning approaches within a multimodal fusion system that generated final predictions using a weighted score fusion scheme. Using the proposed deep-learning-based multimodal system, we reached a test set performance measure of 48.1% on the ABAW 2020 Facial Expressions challenge, improving upon the first runner-up's performance.
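
    The described pattern, a modality-specific CNN followed by an LSTM with weighted score fusion across modalities, can be sketched as follows. Layer sizes, frame counts, and the fusion weights are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the described pattern: a modality-specific CNN extracts per-frame
# embeddings, an LSTM models their temporal dynamics, and per-modality class
# scores are combined by weighted score fusion. Sizes/weights are illustrative.
import torch
import torch.nn as nn


class CnnLstmBranch(nn.Module):
    def __init__(self, in_ch, n_classes=7):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(4))
        self.lstm = nn.LSTM(16 * 4 * 4, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, frames):                   # (batch, time, channels, H, W)
        b, t = frames.shape[:2]
        emb = self.cnn(frames.flatten(0, 1)).flatten(1).view(b, t, -1)
        out, _ = self.lstm(emb)
        return self.head(out[:, -1]).softmax(-1)  # scores from the last time step


video_branch, audio_branch = CnnLstmBranch(3), CnnLstmBranch(1)
video = torch.randn(2, 8, 3, 64, 64)             # 8 face crops per clip
audio = torch.randn(2, 8, 1, 64, 64)             # 8 spectrogram chunks per clip
fused = 0.7 * video_branch(video) + 0.3 * audio_branch(audio)  # weighted score fusion
print(fused.argmax(-1))
```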

    Ensembling End-to-End Deep Models for Computational Paralinguistics Tasks: ComParE 2020 Mask and Breathing Sub-Challenges

    This paper describes deep learning approaches for the Mask and Breathing Sub-Challenges (SCs) addressed by the INTERSPEECH 2020 Computational Paralinguistics Challenge. Motivated by the outstanding performance of state-of-the-art end-to-end (E2E) approaches, we explore and compare the effectiveness of different deep Convolutional Neural Network (CNN) architectures on raw data, log-Mel spectrograms, and Mel-Frequency Cepstral Coefficients. We apply a transfer learning approach to improve the models' efficiency and convergence speed. In the Mask SC, we conduct experiments with several pretrained CNN architectures on log-Mel spectrograms, as well as Support Vector Machines on the baseline features. For the Breathing SC, we propose an ensemble deep learning system that exploits E2E learning and sequence prediction. The first, E2E model is based on a 1D CNN operating on raw speech signals, coupled with Long Short-Term Memory layers for sequence modeling. The second model works with log-Mel features and is based on a pretrained 2D CNN stacked with Gated Recurrent Unit layers. To increase the performance of our models in both SCs, we use ensembles of the best deep neural models obtained from N-fold cross-validation on the combined challenge training and development sets. Our results markedly outperform the challenge test set baselines in both SCs.
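
    The E2E Breathing-SC idea, a 1D CNN over the raw waveform followed by LSTM layers that regress a per-frame breathing value, can be sketched as below. Kernel sizes, strides, and the frame rate are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch of the E2E idea for the Breathing SC: a 1D CNN over the raw waveform
# produces a frame sequence, and LSTM layers regress a per-frame breathing
# value. Kernel sizes, strides, and rates are illustrative assumptions.
import torch
import torch.nn as nn


class RawBreathingNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Two strided 1D conv layers downsample 16 kHz audio to ~25 frames/s.
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=80, stride=16), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=40, stride=40), nn.ReLU(),
        )
        self.lstm = nn.LSTM(64, 32, num_layers=2, batch_first=True)
        self.head = nn.Linear(32, 1)             # one breathing value per frame

    def forward(self, wav):                      # (batch, 1, samples)
        frames = self.cnn(wav).transpose(1, 2)   # (batch, frames, 64)
        out, _ = self.lstm(frames)
        return self.head(out).squeeze(-1)        # (batch, frames)


y = RawBreathingNet()(torch.randn(2, 1, 16000))  # one second of raw audio
print(y.shape)                                   # roughly 25 frames per second
```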
